Poster: archotto Date: Apr 8, 2021 2:02am
Forum: texts Subject: Re: OCR does not any longer include necessary fonts

Thank you for the hint! The problem is that not all volumes offer FULL TEXT. And if you compare, for example, the original text with the full text of vol. 3 in the list in the Wikipedia article "Schweickhardt", you will see that the OCR that produced the full text did not work well at all, so search is unreliable as well. Of course the scan quality of certain pages is low, but overall it is a matter of OCR quality (affecting both FULL TEXT and search). This is a real pity! If there is any solid solution, please help!

Poster: Jeff Kaplan Date: Apr 8, 2021 9:16am
Forum: texts Subject: Re: OCR does not any longer include necessary fonts

we did not scan those so nothing we can do there.

which volumes do not offer full text? please post links.

Poster: archotto Date: Apr 9, 2021 4:06am
Forum: texts Subject: Re: OCR does not any longer include necessary fonts

FULL TEXT is missing for:
Vol. 2: https://archive.org/details/darstellungdese25schwgoog

Vol. 4: https://books.google.at/books?id=e54r-t9LPf8C&redir_esc=y&hl=de

But please compare the original with the full text (beginning of Vol. 1):
https://archive.org/details/darstellungdese16schwgoog/page/n13/mode/2up

9ßä>tenb einet Seit Don 10 Sa&ren , al$ ber
SJerfaffer fein 8iebtfng8fhtbium, bie öfterreiä)ifcbe
©efc&ic&te , unauegefefct eifrig betrieb , fyatte er
bie befte (Gelegenheit , alte SBerf e , bie über bie*
fed Sanb bortyanben finb , genau fennen p tcr=
nen, unb erfab bierauS, bajl (wßbrenb anbere
9>röbin$en beS öffccrrctc^ifdf)cn Äaiferjtaate8 in
neueren «Seiten SRänner gefunben , bon welchen
nfi|ftcf>e unb brauchbare' geograpbifc& 5 ftatifttfc&e
arbeiten erfebienen) ha$ SBiegenfanb beS mäfyti'
gen ©taatenförperl, ba3 @rjberjogt^um £>efter*
teiefc unter ber €it6, fein einziges SSÖerE auf jus
weifen bot/ weftbea bae 8anb im 2C Hg enteis
nett, gteicfmne burefc einzelne £)rt8befc&reis
bung/ umfaffenb (nämtfefc bon feinem @ntfle*
ben an ununterbrochen) batgefieHt fyättt ; benn
außer einigen fttetn , $um £betf fefron unbrauefc*
bar geworbenen, bloß topograpbtfc&en Söer«
fen, unb ber »erbienjiuoHen Söearbei*
t u n g ber l irä)tt#en £opograpbie / bie aber bi$
iefct nur einzelne £)ecanate balb auö biefem, halb
au8 jenem Giertet befebrieb, unb beren ©nbe

Unfortunately I cannot include a JPG (mail does not work).
As you can see, this FULL TEXT does not help at all: hardly any word is spelled correctly, and therefore hardly any word can be located by search!

Best regards
Otto

This post was modified by archotto on 2021-04-09 11:06:34

Poster: Jeff Kaplan Date: Apr 9, 2021 10:12am
Forum: texts Subject: Re: OCR does not any longer include necessary fonts

i'm re-running that. we only recently added the ability to OCR Fraktur fonts.